Using Reinforcement Learning for Generating Polynomial Models to Explain Complex Data
Authors
Abstract
Basic oxygen steelmaking is a complex chemical and physical industrial process that reduces a mix of pig iron and recycled scrap into low-carbon steel. A good understanding of the process and the ability to predict how it will evolve require long operator experience, but this can be augmented with target prediction systems. Such systems may use machine learning to learn a model of the process based on its history, and they have the advantage that they can make use of vastly more parameters than operators can comprehend. While it has become less of a challenge to build such systems using machine learning algorithms, actual production implementations are rare. The hidden reasoning and lack of transparency of such models prevent trust, even for models that show highly accurate predictions. To express model behaviour, and thereby increase trust, we develop a reinforcement learning (RL) agent approach whose task is to generate short polynomials that explain what the model has learnt from the data. The RL agent is rewarded on how well the polynomials it generates explain the model while using a smaller subset of the parameters. Agent training is done with the REINFORCE algorithm, which enables the sampling of multiple concurrently plausible polynomials. Having multiple polynomials, developers can evaluate several alternative explanations, as observed in the historic data. The presented approach gives both a trained generative agent and a set of polynomials that explain the process. The performance of the polynomials is as good as or better than that of other, less interpretable models. Further, the relative simplicity of the resulting polynomials allows good generalisation when fitting new instances of data. The best polynomial in our evaluation achieves a better $R^2$ score on the test set in comparison with the other evaluated models.
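The loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: all names, the toy data, the Bernoulli term-selection policy, and the size-penalty reward are assumptions introduced here for illustration. The paper's agent generates polynomials and is trained with REINFORCE; the sketch below simplifies that to a policy over which candidate terms enter the polynomial, fits the coefficients by least squares, and rewards fit quality minus a complexity penalty.

```python
# Hedged sketch, not the paper's code: REINFORCE over which polynomial terms
# to include, with least-squares coefficient fitting and an R^2-based reward.
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): inputs X and a hidden "black-box" target to explain.
n_samples, n_features = 500, 6
X = rng.normal(size=(n_samples, n_features))
y_blackbox = 2.0 * X[:, 0] ** 2 - 1.5 * X[:, 1] * X[:, 2] + rng.normal(0, 0.1, n_samples)

# Candidate polynomial terms: each term is a tuple of (feature index, power) pairs.
candidate_terms = [((j, 1),) for j in range(n_features)] \
                + [((j, 2),) for j in range(n_features)] \
                + [((i, 1), (j, 1)) for i in range(n_features) for j in range(i + 1, n_features)]

def term_values(term, X):
    v = np.ones(X.shape[0])
    for j, p in term:
        v *= X[:, j] ** p
    return v

def evaluate(mask):
    """Fit coefficients of the selected terms by least squares; return R^2."""
    cols = [np.ones(X.shape[0])] + [term_values(t, X) for t, m in zip(candidate_terms, mask) if m]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, y_blackbox, rcond=None)
    resid = y_blackbox - A @ coef
    return 1.0 - resid.var() / y_blackbox.var()

# REINFORCE over independent Bernoulli inclusion probabilities.
logits = np.zeros(len(candidate_terms))   # policy parameters
lr, penalty, baseline = 0.5, 0.02, 0.0

for step in range(300):
    probs = 1.0 / (1.0 + np.exp(-logits))
    mask = rng.random(len(probs)) < probs              # sample one polynomial
    reward = evaluate(mask) - penalty * mask.sum()     # fit minus size penalty
    baseline = 0.9 * baseline + 0.1 * reward           # moving-average baseline
    # Gradient of the log Bernoulli likelihood w.r.t. the logits is (mask - probs).
    logits += lr * (reward - baseline) * (mask.astype(float) - probs)

# The trained policy can be sampled repeatedly to obtain several short, plausible
# polynomials, mirroring the "multiple concurrent explanations" idea.
sampled = rng.random(len(logits)) < 1.0 / (1.0 + np.exp(-logits))
print("selected terms:", [t for t, m in zip(candidate_terms, sampled) if m])
print("R^2 of a sampled polynomial:", round(evaluate(sampled), 3))
```

In this simplified form the policy factorises over terms; a sequence-generating policy, as suggested by the abstract, would be trained with the same REINFORCE update but sample terms autoregressively.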
Similar Resources
A New Approach to Credibility Premium for Zero-Inflated Poisson Models for Panel Data
The main purpose of this research is to obtain and compare credibility premiums in under-reported count models for panel data. In this research, predictive premiums are calculated based on squared-error and exponential loss functions and compared with each other. The desire to receive a bonus is one of the important reasons for not reporting accidents, and in order to keep their discount, individuals often avoid reporting low-cost accidents. In this research ...
Do Reinforcement Learning Models Explain Neural Learning?
Because the functionality of our brains is still a blank sheet, this paper shall take a glimpse at what possibilities Reinforcement Learning has offered for finding a few parts to fill it up in the last years. It will also be partly illuminated how close to or far from reality these approaches are. Therefore we first reflect on the most successful idea, the temporal difference reward prediction hy...
Machine Learning Models for Housing Prices Forecasting using Registration Data
This article has been compiled to identify the best model for housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...
Generating Student Feedback from Time-Series Data Using Reinforcement Learning
We describe a statistical Natural Language Generation (NLG) method for summarisation of time-series data in the context of feedback generation for students. In this paper, we initially present a method for collecting time-series data from students (e.g. marks, lectures attended) and use example feedback from lecturers in a data-driven approach to content selection. We show a novel way of constru...
Reinforcement Learning with Polynomial Learning Rate in Parameterized Models
We consider reinforcement learning in a parameterized setup, where the model is known to belong to a finite set of Markov Decision Processes (MDPs) under the discounted return criterion. We propose an on-line algorithm for learning in such parameterized models, the Parameter Elimination (PEL) algorithm, and analyze its performance in terms of the total mistakes. The algorithm relies on Wald’s s...
Journal
Journal title: SN Computer Science
Year: 2021
ISSN: ['2661-8907', '2662-995X']
DOI: https://doi.org/10.1007/s42979-021-00488-w